Exploratory Data Analysis on the Global Terrorism Dataset

Getting Started

# Installing the packages required for the exploratory data analysis

#install.packages("ggplot2")
#install.packages("dplyr")
#install.packages("ggmap")
#install.packages("plotly")
#install.packages("pacman")
#install.packages("scales")

# Calling the installed packages
library("scales")
library("plotly")
library("ggplot2")
library("dplyr")
library("ggmap")
library("pacman")

Importing and Cleaning Data.

# Reading the Global Terrorism data.
global_terrorism<-read.csv("globalterrorismdb_0718dist.csv")
# Looking at the dimensions of the global_terrorism data
dim(global_terrorism)
## [1] 181691    135
# We see that it has 181691 rows and 135 columns. Most of the columns are not needed here,
# so we create a cleaner data frame that keeps only the columns used in this analysis.
clean <- global_terrorism[c("iyear", "country_txt", "city", "summary",
                            "attacktype1_txt", "targtype1_txt", "success", "gname",
                            "weaptype1_txt", "propextent_txt", "longitude", "latitude")]
# Removing the rows with NA values from the clean dataset.
clean<-na.omit(clean)
# Summary and head of the clean data.
summary(clean)
##      iyear      country_txt            city             summary         
##  Min.   :1970   Length:177134      Length:177134      Length:177134     
##  1st Qu.:1991   Class :character   Class :character   Class :character  
##  Median :2009   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :2003                                                           
##  3rd Qu.:2014                                                           
##  Max.   :2017                                                           
##  attacktype1_txt    targtype1_txt         success          gname          
##  Length:177134      Length:177134      Min.   :0.0000   Length:177134     
##  Class :character   Class :character   1st Qu.:1.0000   Class :character  
##  Mode  :character   Mode  :character   Median :1.0000   Mode  :character  
##                                        Mean   :0.8881                     
##                                        3rd Qu.:1.0000                     
##                                        Max.   :1.0000                     
##  weaptype1_txt      propextent_txt       longitude            latitude     
##  Length:177134      Length:177134      Min.   :-86185896   Min.   :-53.15  
##  Class :character   Class :character   1st Qu.:        5   1st Qu.: 11.51  
##  Mode  :character   Mode  :character   Median :       43   Median : 31.47  
##                                        Mean   :     -459   Mean   : 23.50  
##                                        3rd Qu.:       69   3rd Qu.: 34.69  
##                                        Max.   :      179   Max.   : 74.63
head(clean)
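
The summary above reports a minimum longitude of -86185896, which lies far outside the valid range of -180 to 180 and is presumably a data-entry error. Below is a minimal sketch of one way to screen out such rows, assuming that dropping them is acceptable for this analysis. The helper name `valid_coords` and the toy rows are illustrative and are not part of the original code.

```r
# Hypothetical helper: keep only rows whose coordinates are physically possible.
valid_coords <- function(df) {
  in_range <- df$longitude >= -180 & df$longitude <= 180 &
    df$latitude >= -90 & df$latitude <= 90
  # which() drops rows where the comparison is NA instead of returning NA rows.
  df[which(in_range), , drop = FALSE]
}

# Made-up rows standing in for `clean`; the second longitude mimics the outlier.
toy <- data.frame(
  city = c("A", "B", "C"),
  longitude = c(44.37, -86185896, 69.17),
  latitude = c(33.30, 14.08, 34.52)
)

checked <- valid_coords(toy)
print(checked)
```

On the real data, the same call would be `clean <- valid_coords(clean)`, placed before any map is drawn.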

Observation:

The ‘iyear’ column shows the year in which a terrorism incident occurred. The data records global terrorism activity from 1970 to 2017.
The ‘country_txt’ column shows the country where the incident occurred. The cleaned dataset contains a total of 204 countries.
The ‘city’ column shows the city in which the incident occurred.
The ‘summary’ column gives a short summary (headline) of the incident for some of the rows in the dataset.
The ‘attacktype1_txt’ column shows the type of attack that took place. The attacks are categorized into 9 types.
The ‘targtype1_txt’ column shows the group of individuals at whom the attack was targeted. There are 22 targeted groups in the cleaned dataset.
The ‘success’ column indicates whether the attack was successful or not.
The ‘weaptype1_txt’ column shows the kind of weapon used for the attack. It is classified into 12 groups.
The ‘gname’ column shows the group that carried out the act of terrorism.
The ‘propextent_txt’ column shows the extent of the property damage (capital) caused. It is broadly classified into 4 groups.
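
The category counts quoted above can be checked directly from the data. Below is a small sketch of one way to do so, assuming the `clean` data frame from the previous step. The toy data frame is made up and only stands in for `clean` so that the snippet runs on its own.

```r
# Made-up rows standing in for `clean`.
toy <- data.frame(
  country_txt = c("Iraq", "Iraq", "India", "Pakistan"),
  attacktype1_txt = c("Bombing/Explosion", "Armed Assault",
                      "Bombing/Explosion", "Assassination"),
  targtype1_txt = c("Military", "Police", "Police", "Business")
)

# Number of distinct values in each categorical column.
cols <- c("country_txt", "attacktype1_txt", "targtype1_txt")
distinct_counts <- sapply(toy[cols], function(x) length(unique(x)))
print(distinct_counts)
```

Replacing `toy` with `clean` and extending `cols` with `weaptype1_txt` and `propextent_txt` would give the corresponding counts for the real data.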

Observing the year data according to different factors

Observation:

We observe how the different types of global terrorism attacks, the weapon types used, and the groups of individuals targeted changed over the years.
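
The plotting code for this section is not shown. As a hedged sketch, the table behind such a year-by-factor plot could be built with `dplyr::count()`. The toy rows below are made up and stand in for `clean`.

```r
library(dplyr)

# Made-up rows standing in for `clean`; only the two columns used here.
toy <- data.frame(
  iyear = c(2013, 2013, 2013, 2014, 2014, 2014),
  attacktype1_txt = c("Bombing/Explosion", "Bombing/Explosion", "Armed Assault",
                      "Bombing/Explosion", "Armed Assault", "Armed Assault")
)

# One row per (year, attack type) pair, with the number of incidents in `n`.
yearly_by_type <- toy %>%
  count(iyear, attacktype1_txt)

print(yearly_by_type)
```

The same pattern applies to `weaptype1_txt` or `targtype1_txt`. The resulting table can then be passed to `ggplot()`, for example with `aes(iyear, n, colour = attacktype1_txt)` and `geom_line()`.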

A frequency distribution of acts of terrorism across different parts of the globe.


Observations:

From this plot we observe that, in the given dataset, Iraq is the country with the maximum number of successful terrorist attacks. Iraq is followed by Pakistan, Afghanistan and India.
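
A ranking like this can be reproduced as a table. Below is a minimal sketch, assuming that `success == 1` marks a successful attack, as the 0/1 range in the summary suggests. The toy rows are made up and stand in for `clean`.

```r
library(dplyr)

# Made-up rows standing in for `clean`.
toy <- data.frame(
  country_txt = c("Iraq", "Iraq", "Iraq", "Pakistan", "Pakistan", "India", "Iraq"),
  success = c(1, 1, 1, 1, 1, 1, 0)
)

# Count successful attacks per country and keep the ten largest counts.
top_countries <- toy %>%
  filter(success == 1) %>%
  count(country_txt, sort = TRUE, name = "attacks") %>%
  head(10)

print(top_countries)
```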

Observing the trend of global terrorism from 1970-2017

## Warning: Ignoring unknown parameters: binwidth, bins, pad


Observations:

In the 1st line plot, we observe a steady increase in the number of terrorism activities (note: the data for the year 1993 is missing), followed by a relative decrease and then a sharp increase until the year 2014.
Note: the behavior of the plot depends on multiple factors, such as world population and the accuracy of the recorded data, and hence one shouldn’t consider this as the only determining factor.

We see that a broadly similar trend is observed for the attack types in the 2nd plot in the chunk.
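
A gap such as the missing year 1993 is easy to overlook in a line plot, because the line simply bridges it. Below is a small sketch of how yearly counts and missing years could be checked. The toy years are made up and stand in for `clean$iyear`.

```r
# Made-up years standing in for `clean$iyear`; 1993 is deliberately absent.
toy <- data.frame(iyear = c(1991, 1991, 1992, 1994, 1994, 1995))

# Number of incidents per year.
yearly_counts <- as.data.frame(table(iyear = toy$iyear), responseName = "incidents")
print(yearly_counts)

# Years inside the observed range that have no rows at all.
years_present <- sort(unique(toy$iyear))
missing_years <- setdiff(seq(min(years_present), max(years_present)), years_present)
print(missing_years)
```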

Observing the percent count of the various attack types.


Observations:

The most common type of terrorism attack is Bombing/Explosion, which takes up almost half the pie, followed by Armed Assault and Assassination.
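
The shares behind a pie chart like this can be computed explicitly, which makes statements such as "almost half" easy to verify. Below is a minimal sketch with made-up rows standing in for `clean`.

```r
library(dplyr)

# Made-up rows standing in for `clean`.
toy <- data.frame(
  attacktype1_txt = c("Bombing/Explosion", "Bombing/Explosion",
                      "Armed Assault", "Assassination")
)

# Incidents per attack type, expressed as a percentage of all incidents.
share <- toy %>%
  count(attacktype1_txt, sort = TRUE) %>%
  mutate(percent = 100 * n / sum(n))

print(share)
```

The same pipeline works for `targtype1_txt` by swapping the column name.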

Observing the percent count of the various target types.


Observations:

Private Citizens and Property is the most commonly targeted group, followed by the Military, Police, the general Government and Business.

A better visualization of the terrorism data across the world.

Observation:

We observe a high density of clustering around the Middle East, Southern Asia, the Andean regions of South America, a few parts of Central Africa, Southern Europe, and South-Eastern Asia.
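
The map code is not shown here. As a rough, assumption-laden sketch, clustering can also be quantified by binning the coordinates into grid cells and counting incidents per cell. The 5-degree cell size and the toy coordinates are arbitrary choices for illustration, not values taken from the original analysis.

```r
library(dplyr)

# Made-up coordinates standing in for `clean`.
toy <- data.frame(
  longitude = c(44.4, 44.1, 45.9, 69.2, -74.1),
  latitude  = c(33.3, 33.9, 32.1, 34.5, 4.7)
)

cell_size <- 5  # grid resolution in degrees (an arbitrary choice)

# Snap each point to the south-west corner of its grid cell, then count per cell.
grid_counts <- toy %>%
  mutate(
    lon_cell = floor(longitude / cell_size) * cell_size,
    lat_cell = floor(latitude / cell_size) * cell_size
  ) %>%
  count(lon_cell, lat_cell, sort = TRUE, name = "incidents")

print(grid_counts)
```

Cells with the highest `incidents` values correspond to the densest clusters on the map. Rows with out-of-range coordinates should be removed first, since they would otherwise create spurious cells.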